This article explains what the metadata quality assessment and its scores mean.
About metadata quality
How and why does umwelt.info assess metadata quality?
umwelt.info provides a metadata quality assessment for each entry. This quality indicator is aimed at data and information providers, because good metadata helps users find relevant information quickly and systematically. It also lets those interested in data quickly assess whether a search result fulfils certain criteria. Important criteria include, for example:
- easy reusability through the use of open licences,
- an open file format, or
- the possibility of automatically downloading a data set.
The assessment of metadata quality is based on the four FAIR principles: Findability, Accessibility, Interoperability and Reusability. More about the four FAIR principles can be found here. The individual criteria are either binary (yes or no), categorical (for example, ranging from unspecific to very specific) or continuous. Table 1 gives an overview of the quality assessment and can also be downloaded as a factsheet (see below).
| FAIR principle | Criterion | Rating system | Significance |
|---|---|---|---|
| Findability | Identification | Yes / No | Does a unique identifier exist for the entry? |
| Findability | Title | Continuous | Is the title informative? |
| Findability | Description | Continuous | Is the description easy to read according to a readability index? |
| Findability | Keywords | Continuous | How many keywords match terms in the environmental thesaurus (UMTHES)? |
| Findability | Geospatial reference | No regional information, general regional name, coordinate-specific region, exact region name, point coordinates | How precise is the localisation? |
| Findability | Time reference | Yes / No | Is a date or time range given? |
| Accessibility | Reference | Not implemented yet | |
| Accessibility | Direct access | Yes / No | Is there a direct link to the original content? |
| Accessibility | Openly available | Yes / No | Can the data be accessed without registration? |
| Interoperability | Machine-readable data | Yes / No | Is an automated read-out of at least one resource (data set) possible? |
| Interoperability | Machine-readable metadata | Yes / No | Is an automated read-out of the metadata possible? |
| Interoperability | Media type | Yes / No | Is the data format of at least one resource (data set) known? |
| Interoperability | Open data format | Yes / No | Is the data format open (for example, CSV)? |
| Reusability | Licence | No information, ambiguous licence, specific licence, specific and open licence | Is the licence specific and open? |
| Reusability | Contact | Yes / No | Is contact information given? |
| Reusability | Publisher | Yes / No | Is the publisher known? |
First example: groundwater measurement station
The example of groundwater measuring point 6504 (retrieved on 28 August 2025) illustrates how the metadata quality is calculated.
The score of each entry is the average of the scores of the individual criteria. For findability, these are:
- Identifier: No unique identifier available (0 points)
- Title: A transformer-based language model trained by us rates the title as informative (80 points)
- Description: High readability based on the readability index (79 points)
- Keywords: Eight of nine keywords are present in the environmental thesaurus (89 points)
- Spatial reference: Exact geodata of the measuring point known (100 points)
- Temporal reference: Period of measurements known (100 points)
Averaging these six values gives groundwater measuring point 6504 a findability score of 75 points (448 / 6 ≈ 74.7, rounded).
We calculate accessibility, interoperability, and reusability in the same way.
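The findability calculation above can be sketched in a few lines of Python. This is only an illustration, not the umwelt.info implementation: the function names `keyword_score` and `principle_score` are hypothetical, and both the rounding to whole points and the score of 0 for an entry without keywords are assumptions. The criterion scores themselves are copied from the list above.

```python
from statistics import mean

# Criterion scores for groundwater measuring point 6504, copied from the list above.
findability_scores = {
    "identifier": 0,
    "title": 80,
    "description": 79,
    "keywords": 89,
    "spatial_reference": 100,
    "temporal_reference": 100,
}


def keyword_score(matched: int, total: int) -> int:
    """Share of keywords found in the thesaurus, as whole points out of 100.

    Assumption: an entry without any keywords scores 0.
    """
    if total == 0:
        return 0
    return round(100 * matched / total)


def principle_score(scores: dict[str, int]) -> int:
    """Unweighted mean of the criterion scores, rounded to whole points (assumed)."""
    return round(mean(scores.values()))


print(keyword_score(8, 9))                  # 89
print(principle_score(findability_scores))  # 75
```

The same `principle_score` helper would apply to accessibility, interoperability, and reusability, given their respective criterion scores.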
Second example
Another example, an entry on pollutant identification in animals from the Palatinate Forest, illustrates how categorical criteria work. In contrast to the first example, the spatial reference criterion scores only 50 points, because only an area (the Palatinate Forest) is given and no point-referenced geodata. The licence scores 33 points, because an unknown licence is specified for the entry. A known but non-free licence would score 66 points, while the absence of a licence would result in 0 points.
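One way to picture the categorical scoring of the licence criterion is a rank-based mapping, sketched below. Several things here are assumptions rather than documented behaviour: that the categories are evenly spaced between 0 and 100, that floor division produces the quoted values of 33 and 66, that the best category ("specific and open licence") scores 100, and that an "unknown licence" corresponds to the category "ambiguous licence" while a "known but non-free licence" corresponds to "specific licence". The function name `categorical_score` is hypothetical.

```python
# Licence categories from Table 1, ordered from worst to best.
LICENCE_CATEGORIES = [
    "no information",
    "ambiguous licence",
    "specific licence",
    "specific and open licence",
]


def categorical_score(category: str, ordered_categories: list[str]) -> int:
    """Map a category to points by its rank, evenly spaced over 0..100.

    Assumption: floor division is used, so that the thirds come out as
    33 and 66, matching the values quoted in the text.
    """
    rank = ordered_categories.index(category)
    steps = len(ordered_categories) - 1
    return 100 * rank // steps


for category in LICENCE_CATEGORIES:
    print(category, categorical_score(category, LICENCE_CATEGORIES))
```

Under these assumptions the four licence categories map to 0, 33, 66, and 100 points, which agrees with the three values given in the example.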